Stacking from Tags: 
Clustering Bookmarks around a Theme 



Arkaitz Zubiaga 

Queens College and Graduate Center 
City University of New York 

New York, NY USA 
arkaitz@zubiaga.org 



ABSTRACT 

Users of the social bookmarking service Delicious have 
recently gained the ability to stack web pages in addition 
to tagging them. Stacking enables users to group web pages 
around specific themes with the aim of recommending them 
to others. However, users still stack only a small subset of 
what they tag, and thus many web pages remain unstacked. 
This paper presents early research towards automatically 
clustering web pages from tags to find stacks and extend 
recommendations. 

1. INTRODUCTION 

Social tagging has become a powerful means to organize 
resources, facilitating later access [1]. On these sites, users 
can label web pages with tags, facilitating future access to 
those web pages both to the author and to other interested 
users [sjjTj. Very recently, Delicious.com introduced a new 
dimension for organizing web pages: stacking. With stacks, 
users can group the web pages that might be of interest 
for a specific community, e.g., Valentine's Special, or UNIX 
and Programming Jokes. Stacks can be very useful for those 
who are looking for help or information on a specific matter. 
With stacks, users are providing a 2-dimensional organization 
of web pages that is complemented with tags, as shown 
by the example in Table 1. However, tagging activity still 
clearly exceeds stacking activity, and many web pages 
are tagged but not stacked. Moreover, none of the web pages 
tagged before the new feature was introduced are associated 
with stacks. Thus, finding a way to infer missing 
stacks from all those tags would help to recommend 
more groups of web pages to communities, or to suggest 
adding web pages to an existing user stack. In contrast to 
the ground truths used in previous research on clustering 
and classifying tagged resources, namely the Open Directory 
Project [5] or manually built categorizations, stacks provide 
a rather ad hoc ground truth to evaluate with. This paper 
describes early research in this direction, presenting 
preliminary work on automatically clustering web pages from tags 
to find stacks as users would do. Preliminary experiments 
suggest that using tags can help reach high-performance 
clustering. 

Stack #1   Stack #2 

URL1   URL2   URL3   URL4   URL5   URL6   URL7 
tags1  tags2  tags3  tags4  tags5  tags6  tags7 

Table 1: Example of a user's tags and stacks. The 
user tagged 7 URLs, with 5 of them in 2 stacks. 

2. DATASET 

We collected the tagging activity for 3,635 Delicious users 
in October and November 2011. This subset includes all the 
users who created at least one stack in this timeframe. During 
this period, those users tagged 182,510 web pages, creating 
5,214 stacks. Out of those web pages, 45,196 (24.8%) were 
stacked while 137,314 (75.2%) were left out of stacks. In 
addition, many users who are not included in our dataset 
tag web pages without creating any stacks. 
Going into further details on the tagging activity in and 
out of the stacks, we observe that, on average, 30.1% of 
the tags contained in stacks are also used out of the stacks. 
This suggests that there is no stack-specific vocabulary; 
users share vocabulary between stacked and unstacked web pages. 
Moreover, only 22.5% of the stacks have a tag that is 
shared by all their underlying web pages. 
Hence, most users do not use an exclusive tag that refers 
to the stack. This motivates our study on the automatic 
clustering of web pages from tags with the aim of finding 
stacks that approximate those created by users. 
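
As an illustration of the two statistics above, the following 
minimal sketch computes, for a single stack, whether all of its 
pages share a tag and what fraction of its tags also appear on 
pages outside the stack. The data layout (a dictionary from URL 
to tag set) and both function names are assumptions made for 
this example; how the reported percentages were aggregated over 
users and stacks is not specified here. 

```python
from typing import Dict, List, Set


def stack_has_common_tag(stack_urls: List[str],
                         tags_by_url: Dict[str, Set[str]]) -> bool:
    """True if at least one tag is attached to every page of the stack."""
    tag_sets = [tags_by_url[url] for url in stack_urls]
    return bool(tag_sets) and bool(set.intersection(*tag_sets))


def shared_tag_ratio(stack_urls: List[str],
                     tags_by_url: Dict[str, Set[str]]) -> float:
    """Fraction of the stack's distinct tags that also occur on pages
    the same user saved outside the stack (assumed definition)."""
    in_stack = set(stack_urls)
    stack_tags = set().union(*(tags_by_url[u] for u in stack_urls))
    outside_tags = set().union(
        *(tags for url, tags in tags_by_url.items() if url not in in_stack)
    )
    if not stack_tags:
        return 0.0
    return len(stack_tags & outside_tags) / len(stack_tags)


# Hypothetical bookmarks of one user.
example_tags = {
    "u1": {"unix", "humor"},
    "u2": {"unix", "programming"},
    "u3": {"humor", "comics"},
    "u4": {"recipes"},
}
example_stack = ["u1", "u2"]

print(stack_has_common_tag(example_stack, example_tags))
print(shared_tag_ratio(example_stack, example_tags))
```

In this toy case the stack has the common tag "unix", and one of 
its three distinct tags ("humor") is also used outside the stack. 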

3. EXPERIMENTS 

We used Cluto rbr [3] (k-way repeated bisections globally 
optimized) to find clusters from tags. Cluto rbr is a convenient 
fit for the present task since, in practice, it always 
generates the same clustering solution for a given input. 
As its main parameter, this algorithm requires 
the number of clusters to generate, known as K. 
We used values of K ranging from 2 to 10, as a preliminary 
approach that allows us to evaluate and understand 
how the number of clusters affects the solution. We 
set the rest of the parameters to their default values. With 
these settings, we obtained the resulting clusters for all the 
web pages saved by each user, and compared the results to the 
stack(s) created by the user. For each run on a stack, we 
computed the precision, recall and F1 values, and obtained the 
macroaveraged values over all the stacks. 
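
The evaluation described above can be sketched as follows. This 
is a minimal illustration, not the evaluation code used for the 
experiments: the function names are hypothetical, and the rule 
for pairing a stack with a cluster (here, the cluster with the 
highest F1 for that stack) is an assumption, since the matching 
rule is not stated in the text. 

```python
from typing import List, Set, Tuple

Scores = Tuple[float, float, float]  # (precision, recall, F1)


def prf(cluster: Set[str], stack: Set[str]) -> Scores:
    """Precision, recall and F1 of one cluster against one stack."""
    overlap = len(cluster & stack)
    precision = overlap / len(cluster) if cluster else 0.0
    recall = overlap / len(stack) if stack else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def score_stack(clusters: List[Set[str]], stack: Set[str]) -> Scores:
    """Assumption: a stack is scored against its best-F1 cluster."""
    return max((prf(c, stack) for c in clusters),
               key=lambda s: s[2], default=(0.0, 0.0, 0.0))


def macro_average(scores: List[Scores]) -> Scores:
    """Unweighted mean of per-stack precision, recall and F1."""
    if not scores:
        return 0.0, 0.0, 0.0
    n = len(scores)
    return (sum(s[0] for s in scores) / n,
            sum(s[1] for s in scores) / n,
            sum(s[2] for s in scores) / n)


# Hypothetical clustering of one user's bookmarks and two stacks.
clusters = [{"u1", "u2", "u3"}, {"u4", "u5"}]
stacks = [{"u1", "u2"}, {"u4", "u6"}]
per_stack = [score_stack(clusters, s) for s in stacks]
print(macro_average(per_stack))
```

Macroaveraging gives every stack the same weight regardless of 
its size, so small stacks influence the final figures as much as 
large ones. 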
Figure 1: Macroaveraged F1, P and R for K-dependent 
Cluto runs, ranging from 2 to 10 for K. 



Figure 1 shows precision, recall, and F1 values for K-dependent 
Cluto runs. Precision and recall vary considerably depending 
on the selected K value. Creating a few big clusters 
improves recall, while creating many small clusters improves 
precision. However, F1 values remain very similar as K 
changes. Regardless of the value selected for K, the clustering 
achieves an F1 above 0.6. Hence, the choice of 
K mainly determines whether the results favor 
precision or recall, depending on the preference. 
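
A small worked example may help to see why precision and recall 
move in opposite directions with K. The toy data below are 
hypothetical (12 bookmarks, of which the user stacked 4), the 
clusters are hand-built rather than produced by Cluto, and the 
stack is again assumed to be scored against its best-F1 cluster. 

```python
def best_scores(clusters, stack):
    """Return (precision, recall, F1) of the cluster that best matches the stack."""
    best = (0.0, 0.0, 0.0)
    for cluster in clusters:
        overlap = len(cluster & stack)
        if overlap == 0:
            continue
        precision = overlap / len(cluster)
        recall = overlap / len(stack)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[2]:
            best = (precision, recall, f1)
    return best


pages = [f"p{i}" for i in range(1, 13)]  # 12 hypothetical bookmarks
stack = set(pages[:4])                   # the user stacked p1..p4

# K = 2: two big clusters; one of them contains the whole stack.
few_big = [set(pages[:6]), set(pages[6:])]
# K = 6: six small clusters; the stack is split over two of them.
many_small = [set(pages[i:i + 2]) for i in range(0, 12, 2)]

few_big_scores = best_scores(few_big, stack)
many_small_scores = best_scores(many_small, stack)
print(few_big_scores)
print(many_small_scores)
```

With two big clusters the stack is fully covered (recall 1.0) at 
a precision of about 0.67, giving an F1 of 0.8. With six small 
clusters precision rises to 1.0 while recall drops to 0.5, giving 
an F1 of about 0.67. In this toy case F1 changes less than either 
precision or recall. 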
Figure 2: Comparison of F1 values for K-dependent 
Cluto runs as compared to benchmark approaches. 



Figure 2 complements the above results by showing the F1 
values for our K-dependent runs as compared to 3 benchmark 
methods: (1) a baseline approach that randomly creates 
the clusters, i.e., randomly generating K clusters of 
equal size, (2) an intermediate approach that randomly selects 
the value of K for each user, i.e., the average of multiple 
runs using random K values, and (3) the ideal upper-bound 
performance obtained by choosing the optimal K value for each 
user stack. These results show that using tags to find stacks 
clearly outperforms a random approach, doubling the performance 
in many cases. This encourages the use of tags to 
perform this task effectively. Moreover, even though 
this paper does not explore how to find a suitable K for 
each stack, the upper-bound performance based on optimal 
K values shows that tags can reach very high results. An 
appropriate selection of K could yield clusters approaching 
0.8 in terms of F1. The upper bound also clearly outperforms 
the random selection of K, which encourages further 
research into finding a suitable K for each user. 

4. CONCLUSIONS 

This work describes early, work-in-progress research on 
a novel feature of social bookmarking systems: stacking. To 
the best of our knowledge, this is the first research work that 
deals with stacks. We have shown that using tags to find 
stacks that resemble those created by users achieves 
F1 scores above 0.6. Moreover, 
choosing the right parameters for each stack to be created 
can substantially improve performance, reaching nearly 0.8. 
As preliminary work, these results encourage 
further study to inform the selection 
of parameters that improve performance. Future work includes 
studying behavioral patterns of users, such as tagging 
vocabulary, towards finding the right parameters for each 
user. The promising results obtained by using tags to discover 
stacks also suggest further research on finding groups of 
related tags for both individual users and communities. 



